Overview

This session is about the basics of stochastic actor-oriented models (SAOMs). We refer to the RSiena page https://www.stats.ox.ac.uk/~snijders/siena/ for further resources. Here we will

  • Introduce SAOM as a simulation model
  • Present the basic functions and functionalities
  • Briefly discuss an empirical example

It is appropriate that we simulate the model, as Siena stands for

Simulation Investigation for Empirical Network Analysis

Packages

We will use functionality from the network packages sna and network (see https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/Markdowns/CHDH-SNA-2.Rmd). The main package for SAOMs is RSiena.

R

The first time you use them, install the packages (uncomment the install commands)

# install.packages("sna")
# install.packages("network")
# install.packages("RSiena")

Once packages are installed, load them

library("sna")
library("network")
library('RSiena')

Data set

The RSiena package automatically loads van de Bunt’s Freshman dataset (see Data description). We will use time-points 3 and 4

?tmp3 # opens the help file with information about the dataset
class(tmp3) # both time-points are regular adjacency matrices
class(tmp4)
dim(tmp3) # 32 students by 32 students
n <- dim(tmp4)[1]

Data checking

The original tie-variables had 6 levels but have here been dichotomised. It is good practice to check that ties are binary

table(tmp3,useNA = 'always')
table(tmp4,useNA = 'always')

Missing

RSiena handles missing data in estimation, but for the purposes of simulating and investigating the network, replace missing values

tmp3[is.na(tmp3)] <- 0 # remove missing
tmp4[is.na(tmp4)] <- 0 # remove missing

Plot change

Plotting the networks with nodes in the same places illustrates what the SAOM will try to model

par(mfrow = c(1,2))
coordin <- plot(as.network(tmp3), main='Time 3')
plot(as.network(tmp4),coord=coordin, main='Time 4')

Looking closely, we see that some arcs have been added and others removed

In this session, we will assume that we have one initial network \(X(t_0)=x_{obs}(t_0)\), at time \(t_0\) and that we want to say something about a network \(X(t_1)\), at time \(t_1\), \(t_0<t_1\).

We will only use two waves but because of the Markov property of the model, all ingredients extend without limitations to several waves.

Format data

To simulate (but also estimate) we need to execute, in turn, each of the functions

  1. sienaDependent - formats class matrix to class sienaDependent
  2. sienaDataCreate - formats data to class siena
  3. getEffects - provides us with the effects we can use for the process
  4. sienaAlgorithmCreate - sets up simulation/estimation settings

After these steps, the model is simulated/estimated using siena07

sienaDependent

The function sienaDependent formats and translates the two adjacency matrices to a “sienaDependent” object:

mynet1 <- sienaDependent(array(c(tmp3, tmp4), # the matrices in order
                               dim=c(32, 32,2))# are set as two slices in an array
                         )

For networks, sienaDependent takes the adjacency matrices stacked together in an \(n \times n \times \text{(number of waves)}\) array.

You can define, as dependent variables, one-mode networks, two-mode networks, or dependent monadic variables

sienaDataCreate

Once you have defined your variables, these are combined into a siena object using sienaDataCreate

mydata <- sienaDataCreate(mynet1)

The siena object adds all relevant information about the data

mydata
## Dependent variables:  mynet1 
## Number of observations: 2 
## 
## Nodeset                  Actors 
## Number of nodes              32 
## 
## Dependent variable mynet1   
## Type               oneMode  
## Observations       2        
## Nodeset            Actors   
## Densities          0.15 0.18

Simulate

getEffects

To determine which effects are available for our combination of data types

myeff <- getEffects(mydata)

Assume a model where an actor \(i\), when they make a change, only cares about how many ties they have.

myeff <- includeEffects(myeff, recip, include=FALSE)# remove reciprocity, which is included by DEFAULT
## [1] effectName   include      fix          test         initialValue
## [6] parm        
## <0 rows> (or 0-length row.names)
myeff$initialValue[myeff$shortName=='Rate'] <- 5.7288
myeff$initialValue[myeff$shortName=='density'][1] <- -0.7349

For later reference, notice how myeff is on the right-hand side of the allocation of includeEffects

Waiting times

What does the rate \(\lambda = 5.7288\) mean?

Each time the network changes (or an opportunity to change occurs), each actor draws a new waiting time \(T_i \overset{i.i.d.}{\thicksim} Exp(\lambda)\).

The actor that gets to change is the one who waits for the shortest amount of time

waiting <- rexp(32, 5.7288)
hist(waiting)

winner <- which( waiting == min(waiting))
paste('The winner is actor: ', winner)
## [1] "The winner is actor:  29"
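The minimum of \(n\) i.i.d. \(Exp(\lambda)\) waiting times is itself exponential with rate \(n\lambda\), so with 32 actors some actor gets to move, on average, every \(1/(32\lambda)\) time units. A quick sanity check in base R (a sketch, not part of the RSiena workflow):

```r
# The winner's waiting time in an exponential race of n actors,
# each with rate lambda, is Exp(n * lambda)
set.seed(1)
n.actors <- 32
lambda <- 5.7288
races <- replicate(10000, min(rexp(n.actors, lambda)))
mean(races)               # Monte Carlo estimate of the winner's mean wait
1 / (n.actors * lambda)   # theoretical expected waiting time
```

The two numbers should agree closely; this is why a higher rate parameter produces more micro-steps between observations.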

Micro-step

Consider an example where actor \(i=1\) decides between creating a tie to \(j=2\) or \(j=4\), or breaking the existing tie to \(j=3\)

With our current model

\[ \Pr(X(1\leadsto 2))=\Pr(X(1\leadsto 4))=\frac{e^{\beta_d(1-2x_{12})}}{1+\sum_{j\neq 1} e^{\beta_d(1-2x_{1j})}}=\frac{\exp(-0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] and

\[ \Pr(X(1\leadsto 3))=\frac{e^{\beta_d(1-2x_{13})}}{1+\sum_{j\neq 1} e^{\beta_d(1-2x_{1j})}}=\frac{\exp(0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] The conditional probability of \(i=1\) creating the tie to \(j=2\) is thus

exp(-0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
## [1] 0.1185728

and the conditional probability of \(i=1\) deleting the tie to \(j=3\) is thus

exp(0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
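The whole choice distribution for this micro-step can be computed in one go. A base-R sketch, assuming (as in the example above) that actor 1's only current tie among actors 2, 3, 4 is to actor 3, and that the choice set is those three toggles plus leaving the network unchanged:

```r
beta.d <- -0.7349
x1 <- c(0, 1, 0)                           # current ties from actor 1 to actors 2, 3, 4
change <- exp(beta.d * (1 - 2 * x1))       # objective-function change from toggling each tie
probs <- c(change, 1) / (sum(change) + 1)  # last entry: leave the network unchanged
names(probs) <- c("toggle 1->2", "toggle 1->3", "toggle 1->4", "no change")
round(probs, 4)  # first entry matches the 0.1186 computed above
sum(probs)       # probabilities sum to 1
```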

sienaAlgorithmCreate

The function sienaAlgorithmCreate determines the simulation/estimation settings. Here we are going to

  • only simulate: simOnly = TRUE
  • a total of n3 = 100 networks, starting in \(X(t_3)\)

sim_model  <-  sienaAlgorithmCreate( 
                          projname = 'sim_model',# name will be appended to output 
                          cond = FALSE, # NOT conditioning on num. changes
                          useStdInits = FALSE,# we are changing some defaults
                          nsub = 0 ,# more about subphases in estimation
                          n3 = 100,# number of simulations (in Phase 3)
                          simOnly = TRUE)# we only want to simulate

We are ready to simulate!

siena07

The function siena07 is the main engine of RSiena and is used to estimate all models (apart from hierarchical SAOMs)

sim_ans <- siena07( sim_model,# the name of our model
                    data = mydata,# all of our data - see above for what is in there
                    effects = myeff,# the effects we have chosen, including parameters
                    returnDeps = TRUE,# save simulated networks
                    batch=TRUE )# batch=FALSE opens a GUI

Read in a function

The networks are in sim_ans$sims but to help with extracting and formatting them we read in the function reshapeRSienaDeps

source("https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/data/reshapeRSienaDeps.R")

We apply it on sim_ans

mySimNets <- reshapeRSienaDeps(sim_ans,n)# n is the number of nodes (defined above)

The object mySimNets is a 100 by 32 by 32 array containing the 100 simulated adjacency matrices

Plot simulated networks

Plot networks with the same layout as for \(X(t_3)\) above

par(mfrow=c(2,5),# want 2 by 5 panels
    oma = c(0,1,0,0) + 0.1,# custom outer margins
    mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
     coord=coordin) # observed network
invisible(# need to suppress printing to console
  apply(mySimNets[1:9,,],# for the first 9 networks in array
      1,# along first dimension, apply the following function
      function(x)
        plot(as.network(x),coord=coordin) ) )

It is hard to spot any differences. Plot the densities of the simulated networks

hist( gden(mySimNets ) )
abline( v = gden(tmp4), col='red')

If actors only care about how many ties they have, we predict the density at \(t_4\) correctly

How about the number of reciprocated ties, where \(x_{ij}x_{ji}=1\)?

hist( dyad.census(mySimNets)[,1] ,xlim=c(9,50))
abline(v = dyad.census(tmp4)[1], col='red')

Clearly, if the only rule actors have is to care about the number of ties they have, this is not enough to explain why there are so many (46) reciprocal dyads
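For reference, the number of mutual dyads counted by dyad.census can also be computed directly from an adjacency matrix (a base-R sketch with hypothetical toy data):

```r
# Mutual dyads: pairs with x_ij = x_ji = 1; dividing by 2 counts each pair once
count.mutual <- function(x) sum(x * t(x)) / 2

# toy example: actors 1 and 2 reciprocate; 1 -> 3 is unreciprocated
toy <- matrix(0, 3, 3)
toy[1, 2] <- toy[2, 1] <- toy[1, 3] <- 1
count.mutual(toy)  # 1
```

Applied to tmp4, this reproduces the observed count of mutual dyads.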

Reciprocity model

Simulate assuming that actors want to have reciprocated ties to others, using the model with probabilities \(\Pr(X(i \leadsto j) \mid i \text{ makes decision})\): \[ p_{ij}(X,\beta)=\frac{e^{\beta_d (1-2x_{ij}) +\beta_r (1-2x_{ij})x_{ji} } }{ \sum_h e^{\beta_d (1-2x_{ih}) +\beta_r (1-2x_{ih})x_{hi} } } \]

with parameters \(\beta_d=-1.1046\) and \(\beta_r=1.2608\):

myeff <- includeEffects(myeff, recip,include=TRUE)# reinstate reciprocity which is DEFAULT
##   effectName  include fix   test  initialValue parm
## 1 reciprocity TRUE    FALSE FALSE          0   0
myeff$initialValue[myeff$shortName =='recip'][1] <- 1.2608
#### === We also need to change the other values
myeff$initialValue[myeff$shortName=='Rate'] <- 6.3477
myeff$initialValue[myeff$shortName=='density'][1] <- -1.1046
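The micro-step probabilities of the reciprocity model can be evaluated the same way as before. A base-R sketch with hypothetical tie configurations (x.out and x.in are made-up example values, not taken from the data):

```r
beta.d <- -1.1046
beta.r <- 1.2608
x.out <- c(0, 1, 0)  # hypothetical current ties i -> j, for three others j
x.in  <- c(1, 0, 0)  # hypothetical current ties j -> i
# objective-function change from toggling each tie, as in the formula above
change <- exp(beta.d * (1 - 2 * x.out) + beta.r * (1 - 2 * x.out) * x.in)
probs <- c(change, 1) / (sum(change) + 1)  # last entry: no change
round(probs, 4)
```

With \(\beta_r>0\), creating the tie that reciprocates an incoming tie (the first option) is more probable than creating an unreciprocated tie (the third option).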

Log odds for a new graph

              \(x_{ji}=0\)   \(x_{ji}=1\)
\(x_{ij}=0\)       0         \(\beta_d\)
\(x_{ij}=1\)  \(\beta_d\)    \(\beta_d+\beta_r\)

Simulate

Set up the algorithm

sim_model  <-  sienaAlgorithmCreate( 
  projname = 'sim_model',
  cond = FALSE, 
  useStdInits = FALSE,
  nsub = 0 ,
  n3 = 100,
  simOnly = TRUE)

Run simulation

sim_ans <- siena07( sim_model, data = mydata,
                    effects = myeff,
                    returnDeps = TRUE, batch=TRUE )

Reshape and extract networks:

mySimNets <- reshapeRSienaDeps(sim_ans,n)

Investigate simulated networks

Plot, using the same code as before

par(mfrow=c(2,5),# want 2 by 5 panels
    oma = c(0,1,0,0) + 0.1,# custom outer margins
    mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
     coord=coordin) # observed network
invisible(# need to suppress printing to console
  apply(mySimNets[1:9,,],# for the first 9 networks in array
      1,# along first dimension, apply the following function
      function(x)
        plot(as.network(x),coord=coordin) ) )

Compare the observed number of reciprocated dyads to the simulated number of reciprocated dyads

hist( dyad.census(mySimNets)[,1])
abline(v = dyad.census(tmp4)[1], col='red')

With an actor preference of 1.26 for having their ties reciprocated, we reproduce the observed level of reciprocity

What about the triad census? Check the first 9 simulated networks

triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks

Look at the distributions for transitive triads and dense triads

par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(8,40))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,24))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')

A preference for reciprocity is not enough to explain why reciprocated dyads hang together in triangles or why there are so many transitive triads

Triangle model

Add another positive parameter \(\beta_t\) for the preference for closing open triads.

myeff <- includeEffects(myeff, recip,include=TRUE) # add the effect reciprocated ties
myeff <- includeEffects(myeff, transTrip,include=TRUE)
myeff$initialValue[myeff$shortName=='Rate'] <- 7.0959
myeff$initialValue[myeff$shortName=='density'][1] <- -1.6468
myeff$initialValue[myeff$shortName=='recip'][1] <- 0.8932
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.2772

Simulate

Set up the algorithm

sim_model  <-  sienaAlgorithmCreate( 
  projname = 'sim_model',
  cond = FALSE, 
  useStdInits = FALSE,
  nsub = 0 ,
  n3 = 100,
  simOnly = TRUE)

Run simulation

sim_ans <- siena07( sim_model, data = mydata,
                    effects = myeff,
                    returnDeps = TRUE, batch=TRUE )

Reshape and extract networks:

mySimNets <- reshapeRSienaDeps(sim_ans,n)

What about the triad census? Check the first 9 simulated networks

triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks

Look at the distributions for transitive triads and dense triads

par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(2,55))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,86))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')

Having preferences for reciprocated ties and transitive ties seems to explain ‘most’ of the structure

Estimate SAOMs

How did I pick the numbers?

  • \(\beta_d=-1.6468\)
  • \(\beta_r=0.8932\)
  • \(\beta_t=0.2772\)

Manually, you could

  1. pick values \(\beta_d^{\ast}\), \(\beta_r^{\ast}\), and \(\beta_t^{\ast}\)
  2. simulate lots of networks, and
  3. adjust the parameters, so that you
  • increase (decrease) \(\beta_d^{\ast}\) if density is too low (high)
  • increase (decrease) \(\beta_r^{\ast}\) if reciprocity is too low (high)
  • increase (decrease) \(\beta_t^{\ast}\) if transitivity is too low (high)
  4. repeat

Luckily, we have an algorithm (stochastic approximation) for automating this
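The idea can be illustrated on a toy problem that has nothing to do with networks: use a Robbins–Monro update to find the log odds \(\theta\) of a coin so that simulated head counts match an observed count. This is a self-contained sketch of the stochastic-approximation principle only, not of siena07's actual algorithm (which also uses a derivative matrix \(D\) and subphases):

```r
# Robbins-Monro: solve E_theta[z] = z_obs by theta <- theta - a_r * (z_r - z_obs)
set.seed(42)
z.obs  <- 30       # 'observed' number of heads in 100 tosses
n.toss <- 100
theta  <- 0        # log odds of heads, starting value
for (r in 1:2000) {
  p   <- plogis(theta)             # current success probability
  z.r <- rbinom(1, n.toss, p)      # one simulated statistic per iteration
  a.r <- 5 / r                     # decreasing gain sequence
  theta <- theta - a.r * (z.r - z.obs) / n.toss
}
round(c(estimate = plogis(theta), target = z.obs / n.toss), 3)  # both near 0.30
```

Note that each update uses a single noisy simulated statistic, just as in Phase 2 of siena07.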

Defining model

We define the model in the same way as when we simulated data

myeff <- getEffects(mydata)# we already have our effects, but let's start from scratch
myeff <- includeEffects(myeff, recip,include=TRUE) # add the reciprocity effect (it is already included by DEFAULT though)
myeff <- includeEffects(myeff, transTrip,include=TRUE)# we want to estimate the transitivity effect

Set up the algorithm with default values (siena07 will assume you want to estimate)

est_model  <-  sienaAlgorithmCreate( 
  projname = 'est_model',
  n3 = 1000,# number of simulations in Phase 3 (default *is* 1000)
  simOnly = FALSE,# default *is* FALSE
  cond = FALSE, 
  useStdInits = FALSE)

Estimate!

est_ans <-siena07(est_model,# algorithm object
                  data = mydata, # the data object we created earlier
                  effects = myeff,# same effects as in simulation
                  returnDeps = TRUE,
                  batch=TRUE)

Estimation gives us an ANOVA-style table with

  • Parameter (point) estimates
  • Standard errors of the estimates
  • Convergence statistics

summary( est_ans )
## Estimates, standard errors and convergence t-ratios
## 
##                                       Estimate   Standard   Convergence 
##                                                    Error      t-ratio   
##   1. rate basic rate parameter mynet1  7.0328  ( 0.9465   )   -0.0318   
##   2. eval outdegree (density)         -1.6409  ( 0.1302   )    0.0058   
##   3. eval reciprocity                  0.8943  ( 0.2109   )   -0.0109   
##   4. eval transitive triplets          0.2746  ( 0.0431   )   -0.0315   
## 
## Overall maximum convergence ratio:    0.0921 
## 
## 
## Total of 2318 iteration steps.
## 
## Covariance matrix of estimates (correlations below diagonal)
## 
##        0.896        0.004        0.010       -0.009
##        0.032        0.017       -0.010       -0.004
##        0.048       -0.368        0.044       -0.001
##       -0.225       -0.628       -0.134        0.002
## 
## Derivative matrix of expected statistics X by parameters:
## 
##       11.672        5.133        4.124       50.798
##       67.740      217.080      144.440     1192.800
##       16.458       71.368       81.662      441.285
##      191.972      592.138      461.720     4590.234
## 
## Covariance matrix of X (correlations below diagonal):
## 
##      127.313      112.444       78.146      808.183
##        0.542      338.506      249.928     2159.994
##        0.449        0.881      237.609     1755.613
##        0.554        0.908        0.881    16709.070

These estimates look very much like the numbers that I picked arbitrarily - what makes these better?

The observed statistics that we want to ‘hit’ on average are

est_ans$targets
## [1] 125 175  92 431
# the same as sim_ans$targets

The simulated statistics for the parameters from the estimation are

colMeans(est_ans$sf2)
##         [,1]    [,2]   [,3]    [,4]
## [1,] 124.641 175.107 91.832 426.925
# also provided in:
# est_ans$estMeans

The simulated statistics for the parameters from the simulation are

sim_ans$estMeans
## [1] 125.64592 174.72228  91.03856 429.86593

We can calculate within how many standard deviation units of the targets we are for each statistic \[ \frac{\bar{z}_k-z_{obs,k}}{sd(z_k)} \] The deviations using the estimates from estimation:

(colMeans(est_ans$sf2)-est_ans$targets)/sqrt(diag(var(est_ans$sf2[,1,])))
##             [,1]        [,2]        [,3]        [,4]
## [1,] -0.03181686 0.005815681 -0.01089877 -0.03152474
# Also provided in:
# est_ans$tstat

For estimates from simulation:

(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))

For both sets of parameters, the simulated statistics are on average within less than 0.1 standard deviation units of the observed statistics

As a quick illustration, we can see what happens when we set rate, density, and reciprocity to the estimated values

myeff$initialValue[myeff$shortName=='Rate'] <- est_ans$theta[1]
myeff$initialValue[myeff$shortName=='density'][1] <- est_ans$theta[2]
myeff$initialValue[myeff$shortName=='recip'][1] <-est_ans$theta[3]

but pick another value for transitivity:

myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.1

and then, simulate!

sim_model  <-  sienaAlgorithmCreate( 
  projname = 'sim_model',
  cond = FALSE, 
  useStdInits = FALSE,
  nsub = 0 ,
  n3 = 1000,
  simOnly = TRUE)
sim_ans <- siena07( sim_model, data = mydata,
                    effects = myeff,
                    returnDeps = TRUE, batch=TRUE )

Calculate the scaled difference between the average statistics and the observed statistics again

(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))

The average simulated statistics:

sim_ans$estMeans

are not close to the observed

sim_ans$targets

Decreasing the transitivity parameter results in networks having too few transitive triads

Convergence check

In general, we say that the estimation has converged, and that the estimates can be interpreted, if

  1. all convergence t-ratios satisfy \(| t_{conv} | < 0.1\), and
  2. the overall maximum convergence ratio is less than \(0.25\)
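The two rules of thumb can be wrapped in a small helper. This is a hypothetical convenience function (not part of RSiena); its inputs are the convergence t-ratios and the overall maximum convergence ratio reported in the summary output:

```r
# Hypothetical helper: apply the two convergence rules of thumb
saom.converged <- function(t.ratios, max.ratio) {
  all(abs(t.ratios) < 0.1) && max.ratio < 0.25
}
# with the values reported for est_ans above:
saom.converged(c(-0.0318, 0.0058, -0.0109, -0.0315), 0.0921)  # TRUE
# one t-ratio above 0.1 would signal non-convergence:
saom.converged(c(-0.0318, 0.2100, -0.0109, -0.0315), 0.0921)  # FALSE
```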

Summary of estimation

The aim of the estimation algorithm is to find estimates \(\hat{\theta}\), such that \[ \underbrace{E_{\hat{\theta}}\{ Z[X(t_1)] \mid X(t_0)=x_{obs}(t_0)\}}_{\text{'average' statistics}}=Z[x_{obs}(t_1)] \] The stochastic approximation algorithm in siena07 solves this equation computationally in three Phases:

  1. Phase 1: Determining how big a change you get in \(E_{\theta}\{ Z(X) \mid x_{obs}(t_0)\}\), to calibrate updates to \(\theta\). This phase determines \(D\)
  2. Phase 2: The main estimation phase, where you incrementally update the current values \(\theta^{(r)}\) to \(\theta^{(r+1)}\): \[ \theta^{(r+1)} = \theta^{(r)} - a_r D^{-1}(z_r-z_{obs}) \] where \(z_r=z(X_r)\), and \(X_r\) is simulated from \(x_{obs}(t_0)\) with parameter values \(\theta^{(r)}\). The final values (in practice, average values over a subphase) are the estimates \(\hat{\theta}\)
  3. Phase 3: A large number of networks \(x^{(1)},\ldots,x^{(n_3)}\) are simulated using \(\hat{\theta}\) to calculate \(\bar{z}\) and \(sd(z)\), in order to calculate convergence t-ratios. In addition, the standard errors \(se(\hat{\theta})\) of the estimates \(\hat{\theta}\) are calculated.

You may notice that the difference that is minimised in this algorithm is not \(\bar{z}-z_{obs}\): only one network is simulated in each iteration - but it works (the other way also works but is less efficient)

Standard errors

The second column in the ANOVA table contains the nominal standard errors, i.e. the approximate standard deviations of the estimators of the \(\theta\)’s. Typically, these are used for standard hypothesis tests:

For effect/parameter \(k\), test \[ H_0:\beta_k=\beta_{k,0}=0 \] against \[ H_1:\beta_k\neq 0 \] The test statistic \[ \frac{\hat{\beta}_k-\beta_{k,0} }{sd(\hat{\beta}_k)}=\frac{\hat{\beta}_k }{sd(\hat{\beta}_k)}\approx \frac{\hat{\beta}_k }{se(\hat{\beta}_k)}, \] is approximately standard normal, \(N(0,1)\), if \(H_0\) is true.

Given the number of assumptions and approximations we need to make, I would advise that we stick to testing at the nominal 5% level, and reject \(H_0\) when the test statistic is greater than \(2\) in absolute value.

Test of simple model

In the simple model we estimated, let us test \(H_0:\beta_t=0\). The observed test statistic

est_ans$theta[4]/est_ans$se[4]
## [1] 6.370043

is considerably larger than 2, and we can reject the null-hypothesis that the true value of the transitivity parameter is \(0\).
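For reference, the corresponding two-sided p-value under the normal approximation can be computed in base R, using the test statistic printed above:

```r
t.stat  <- 6.370043                  # estimate / standard error, from above
p.value <- 2 * pnorm(-abs(t.stat))   # two-sided p-value under N(0,1)
p.value                              # vanishingly small
```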

Score-type test

Intuitively, we could also test if we ‘need’ \(\beta_t\) by estimating the model with \(\beta_t=0\) fixed, and checking the convergence t-statistic. You can do this yourself now!

Testing attribute

There are many different types of covariates

Type                          RSiena type
Constant monadic covariate    coCovar
Changing monadic covariate    varCovar
Constant dyadic covariate     coDyadCovar
Changing dyadic covariate     varDyadCovar
Changing (covariate) network  sienaDependent

The usual nomenclature for measurement scales, and the distinction between metric and non-metric variables, applies.

Model with attributes

The adjacency matrices s501, s502, and s503, for the so-called s50 dataset, a friendship network of 50 girls, are also available with RSiena. For a full description of this dataset, see ?s50 or http://www.stats.ox.ac.uk/~snijders/siena/s50_data.htm

Rename adjacency matrices

For clarity, we rename the adjacency matrices

friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503

Note that we have three (3) waves.

Among the many monadic covariates are ‘smoking’ and ‘drinking’:

drink <- s50a
smoke <- s50s

Here, each variable has its own \(n \times 3\) data matrix, with one observation for each individual at each time-point

head(smoke)

We are going to test \[ H_0: \text{smokers have as many friends as non-smokers} \] against an alternative hypothesis stating that there is a difference. We will interpret this as smokers “sending as many ties” and “receiving as many ties” as non-smokers.

Formatting covariates

We will use smoking at \(t_0\) as our explanatory variable and define ‘smoking’ as a value of 2 or 3.

smoke1.matrix <- as.numeric( smoke[,1]>1 )
table(smoke1.matrix, useNA='always')

To tell RSiena how to use this variable, we format it

smoke1 <- coCovar( smoke1.matrix )

Print to screen to see how R has interpreted the variable

smoke1

Format DV

Format the dependent network variable as before:

friendshipData <- array( c( friend.data.w1, 
                            friend.data.w2, 
                            friend.data.w3 ),
                         dim = c( 50, 50, 3 ) )
friendship <- sienaDependent(friendshipData)

sienaDataCreate

Using sienaDataCreate, we give RSiena the opportunity to figure out how the data are structured and what types of effects can be calculated from it

s50data <- sienaDataCreate( friendship, smoke1)

getEffects

Since we have both a network and a monadic covariate, we will have more effects available to us than previously

s50.effects <- getEffects(s50data)

Sender effect

For testing our hypothesis we want a statistic that corresponds to \[ v_i x_{ij} \] for each tie-variable, and where \(v_i\) is one or zero according to whether \(i\) is a smoker or not, respectively. This is the sender effect

s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
                               egoX,# the shortname evokes that the variable for 'ego' affects the decision
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )

Receiver effect

Next, we want a statistic that corresponds to \[ v_j x_{ij} \] so that if the corresponding parameter is positive, then actors are more likely to form (maintain) ties to actors \(j\) that have \(v_j=1\), i.e. are smokers

s50.effects <- includeEffects( s50.effects,
                               altX,# the shortname evokes that the variable for 'alter' affects ego's decision
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )
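To see what the ego and alter statistics measure, they can be computed by hand from an adjacency matrix x and a binary covariate v. A base-R sketch with hypothetical toy data (not RSiena's internal calculation):

```r
# ego statistic: sum_i v_i * outdegree_i (ties sent by smokers)
# alter statistic: sum_j v_j * indegree_j (ties received by smokers)
x <- matrix(0, 4, 4)
x[1, 2] <- x[1, 3] <- x[2, 1] <- x[4, 1] <- 1  # toy network
v <- c(1, 0, 0, 1)                             # actors 1 and 4 are 'smokers'
ego.stat   <- sum(v * rowSums(x))
alter.stat <- sum(v * colSums(x))
c(ego = ego.stat, alter = alter.stat)  # ego 3, alter 2
```

Positive parameters on these statistics mean smokers send, respectively receive, more ties than non-smokers.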

Transitivity

For the van de Bunt dataset, we saw that triadic closure was important. We can choose to include it because we believe it is important, but also as a control for our hypothesised effects

s50.effects <- includeEffects(s50.effects,# we "add an effect" to s50.effects
                              transTrip,# shortname of the effect
                              include=TRUE)

Specified model

Our specified model is

s50.effects
##   effectName                          include fix   test  initialValue parm
## 1 constant friendship rate (period 1) TRUE    FALSE FALSE    4.69604   0   
## 2 constant friendship rate (period 2) TRUE    FALSE FALSE    4.32885   0   
## 3 outdegree (density)                 TRUE    FALSE FALSE   -1.46770   0   
## 4 reciprocity                         TRUE    FALSE FALSE    0.00000   0   
## 5 transitive triplets                 TRUE    FALSE FALSE    0.00000   0   
## 6 smoke1 alter                        TRUE    FALSE FALSE    0.00000   0   
## 7 smoke1 ego                          TRUE    FALSE FALSE    0.00000   0

sienaAlgorithmCreate

Specify the simulation settings

s50algorithm.simple <- sienaAlgorithmCreate( projname = 's50_simple' )# We are only using defaults

Estimate

Estimating the model with default settings is simply a call to siena07

s50.simple.ans <- siena07( s50algorithm.simple,# estimation settings
                  data = s50data,# data object that knows DV and IV
                  effects = s50.effects,# the effects we specified
                  batch = TRUE,
                  returnDeps = TRUE )

Results

Print the results to screen

summary( s50.simple.ans )
## Estimates, standard errors and convergence t-ratios
## 
##                                    Estimate   Standard   Convergence 
##                                                 Error      t-ratio   
## 
## Rate parameters: 
##   0.1      Rate parameter period 1  6.4725  ( 1.0404   )             
##   0.2      Rate parameter period 2  5.2730  ( 0.8831   )             
## 
## Other parameters: 
##   1.  eval outdegree (density)     -2.6889  ( 0.1229   )   -0.1023   
##   2.  eval reciprocity              2.4580  ( 0.2056   )   -0.0845   
##   3.  eval transitive triplets      0.6245  ( 0.0756   )   -0.0161   
##   4.  eval smoke1 alter            -0.0157  ( 0.1734   )    0.0570   
##   5.  eval smoke1 ego               0.0810  ( 0.1817   )    0.0488   
## 
## Overall maximum convergence ratio:    0.1624 
## 
## 
## Total of 2220 iteration steps.
## 
## Covariance matrix of estimates (correlations below diagonal)
## 
##        0.015       -0.016       -0.004        0.001       -0.001
##       -0.614        0.042       -0.002        0.000        0.003
##       -0.429       -0.158        0.006       -0.001        0.000
##        0.027       -0.002       -0.082        0.030       -0.007
##       -0.032        0.068       -0.025       -0.208        0.033
## 
## Derivative matrix of expected statistics X by parameters:
## 
##      281.414      223.884      489.690        2.432        2.305
##      127.684      138.064      259.263        0.219        0.072
##      351.990      325.050     1169.890       15.901       11.704
##        4.308        4.487       18.212       41.115       20.689
##        2.631       -0.275        9.060       25.283       41.266
## 
## Covariance matrix of X (correlations below diagonal):
## 
##      466.522      402.905     1135.459        8.084        8.922
##        0.927      405.345     1104.583        7.262        7.054
##        0.805        0.841     4259.473       43.032       37.970
##        0.049        0.047        0.087       57.886       45.160
##        0.054        0.046        0.077        0.782       57.635

Are all convergence statistics less than 0.1 in absolute value? If so, we can test our hypothesis - do we reject \(H_0\)?

Adding homophily

To account for (test) possible assortativity on smoking, we add the homophily effect:

s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
                               egoXaltX,# the shortname
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )

Re-estimate

s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
                  data = s50data,# data object that knows DV and IV
                  effects = s50.effects,# the effects we specified
                  batch = TRUE,
                  returnDeps = TRUE )

Results

Print the results to screen

summary( s50.hom.ans )
## Estimates, standard errors and convergence t-ratios
## 
##                                      Estimate   Standard   Convergence 
##                                                   Error      t-ratio   
## 
## Rate parameters: 
##   0.1      Rate parameter period 1    6.4748  ( 1.1283   )             
##   0.2      Rate parameter period 2    5.2921  ( 0.8601   )             
## 
## Other parameters: 
##   1.  eval outdegree (density)       -2.6975  ( 0.1106   )   0.0210    
##   2.  eval reciprocity                2.4441  ( 0.1885   )   0.0395    
##   3.  eval transitive triplets        0.6113  ( 0.0786   )   0.0436    
##   4.  eval smoke1 alter              -0.0737  ( 0.1855   )   0.0378    
##   5.  eval smoke1 ego                -0.0158  ( 0.2037   )   0.0296    
##   6.  eval smoke1 ego x smoke1 alter  0.6332  ( 0.3449   )   0.0059    
## 
## Overall maximum convergence ratio:    0.0797 
## 
## 
## Total of 2197 iteration steps.
## 
## Covariance matrix of estimates (correlations below diagonal)
## 
##        0.012       -0.011       -0.004        0.001        0.000       -0.003
##       -0.537        0.036       -0.003       -0.004        0.003        0.001
##       -0.422       -0.230        0.006        0.000       -0.002       -0.002
##        0.068       -0.116        0.022        0.034       -0.014       -0.012
##        0.011        0.082       -0.098       -0.368        0.041       -0.015
##       -0.068        0.009       -0.088       -0.195       -0.213        0.119
## 
## Derivative matrix of expected statistics X by parameters:
## 
##      291.668      225.793      497.884        3.389        4.723       17.481
##      132.402      142.550      265.581        4.666        3.152        8.337
##      348.222      319.782     1149.917       12.115       13.868       28.577
##       10.932       11.249       25.251       46.809       31.274       11.297
##        5.451        2.805       16.632       29.704       42.730        9.994
##       18.161       16.561       43.225       11.069       11.323       13.880
## 
## Covariance matrix of X (correlations below diagonal):
## 
##      469.475      403.896     1138.588       10.821        8.497       36.436
##        0.929      402.607     1095.798       10.707        6.356       33.364
##        0.801        0.832     4304.684       25.151       15.073       99.800
##        0.062        0.067        0.048       64.005       53.121       18.962
##        0.049        0.040        0.029        0.837       62.923       18.636
##        0.379        0.375        0.343        0.534        0.530       19.663

Interpretation

If

  1. all convergence t-ratios satisfy \(| t_{conv} | < 0.1\), and
  2. the overall maximum convergence ratio is less than \(0.25\),

we can test our hypothesis, controlling for possible assortativity/homophily

What about homophily on smoking - do smokers befriend other smokers?

Dependent behaviour

We are now going to use drinking alcohol as both a changing covariate and a dependent variable.

sienaDependent

Since the alcohol variable is a dependent variable, we use sienaDependent to format it (print to screen to get a quick view of the information stored)

drinking <- sienaDependent( drink, type = "behavior" )

sienaDataCreate

We now have a dataset with two dependent variables so we need to create a new siena data object using sienaDataCreate:

s50data.infl <- sienaDataCreate( friendship, smoke1, drinking )

getEffects

With an additional dependent variable, additional effects become available (NB: the effects object must be created anew from the new data object)

s50.effects.infl <- getEffects(s50data.infl)

As covariate

The dependent variable drinking can act as a covariate on the network. We are going to use the ‘complicated’, five-parameter, homophily parametrisation of Lomi and Snijders (2019):

s50.effects.infl <- includeEffects( s50.effects.infl,# again, 'adding' to the effects
                                    egoX,# sender effect for alcohol
                                    egoSqX,# non-linear sender effect
                                    altX,# receiver effect for alcohol
                                    altSqX,# non-linear receiver effect
                                    diffSqX,# squared difference of alcohol ego and alcohol alter
                                    interaction1 = "drinking" # DV works the same as an IV on another DV
                                    )
##   effectName             include fix   test  initialValue parm
## 1 drinking alter         TRUE    FALSE FALSE          0   0   
## 2 drinking squared alter TRUE    FALSE FALSE          0   0   
## 3 drinking ego           TRUE    FALSE FALSE          0   0   
## 4 drinking squared ego   TRUE    FALSE FALSE          0   0   
## 5 drinking diff. squared TRUE    FALSE FALSE          0   0

As a DV

The dependent variable drinking can also be modelled as an outcome in its own right, with the other dependent variables (here, the friendship network) acting as predictors.

s50.effects.infl <- includeEffects( s50.effects.infl,# still augmenting effects structure
                                    avAlt,# this is the shortname for "average alter"
                                    name="drinking",# name: what is the dependent variable
                                    interaction1 = "friendship" # the network is now "independent" variable
                                    )
##   effectName             include fix   test  initialValue parm
## 1 drinking average alter TRUE    FALSE FALSE          0   0
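The ‘average alter’ effect is only one of several ways to specify social influence in RSiena. As an illustrative (hypothetical) alternative, the ‘average similarity’ effect (shortname `avSim`) models attraction towards the mean of network alters in terms of similarity rather than their average value; normally only one such specification would be included at a time.

```r
# A sketch of an alternative influence specification: 'average similarity'
# (avSim) instead of 'average alter' (avAlt); s50.effects.alt is a
# hypothetical, separate effects object built from the same data
s50.effects.alt <- includeEffects( getEffects(s50data.infl),
                                   avSim,                      # average similarity to network alters
                                   name = "drinking",          # drinking is the dependent variable
                                   interaction1 = "friendship" # the network is the explanatory variable
                                   )
```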

Other effects

We use the same social selection effects for smoking as before: the sender effect

s50.effects.infl <- includeEffects( s50.effects.infl,# we 'add an effect' to our effects object
                               egoX,# the shortname here evokes that the variable for 'ego' affects the decision
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )

and the receiver effect

s50.effects.infl <- includeEffects( s50.effects.infl,
                               altX,# the shortname here evokes that the variable for 'alter' affects the decision of ego
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )

and the homophily effect

s50.effects.infl <- includeEffects( s50.effects.infl,# we 'add an effect' to our effects object
                               egoXaltX,# the shortname
                               interaction1 = "smoke1" # the variable we want to interact the DV with
                               )

Structural effects

We will use two different transitivity effects

s50.effects.infl <- includeEffects( s50.effects.infl, transTrip, transRecTrip )
##   effectName                  include fix   test  initialValue parm
## 1 transitive triplets         TRUE    FALSE FALSE          0   0   
## 2 transitive recipr. triplets TRUE    FALSE FALSE          0   0

Estimate

Set up estimation settings:

s50.algo.infl <- sienaAlgorithmCreate( projname = 's50_infl' )

And we can estimate the model (it will take a little longer)

s50.est.infl <- siena07( s50.algo.infl,
                         data = s50data.infl,
                         effects = s50.effects.infl,
                         batch = TRUE,
                         returnDeps = TRUE )
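If the convergence criteria are not met on the first run, a common remedy is to re-run the estimation starting from the previous answer, via the `prevAns` argument of `siena07`. A sketch, using the overall maximum convergence ratio stored in the fitted object (`tconv.max`):

```r
# If the overall maximum convergence ratio exceeds the conventional 0.25
# threshold, re-estimate starting from the previous answer
if (s50.est.infl$tconv.max >= 0.25) {
  s50.est.infl <- siena07( s50.algo.infl,
                           data = s50data.infl,
                           effects = s50.effects.infl,
                           batch = TRUE,
                           returnDeps = TRUE,
                           prevAns = s50.est.infl )  # continue from previous estimates
}
```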

Results

We view the table of estimates as before:

summary(s50.est.infl)
## Estimates, standard errors and convergence t-ratios
## 
##                                                Estimate   Standard   Convergence 
##                                                             Error      t-ratio   
## Network Dynamics 
##    1. rate constant friendship rate (period 1)  6.3027  ( 0.9675   )    0.0526   
##    2. rate constant friendship rate (period 2)  5.0138  ( 0.8284   )   -0.0700   
##    3. eval outdegree (density)                 -2.7672  ( 0.2373   )    0.0550   
##    4. eval reciprocity                          2.8244  ( 0.3193   )    0.0468   
##    5. eval transitive triplets                  0.8982  ( 0.1577   )    0.0566   
##    6. eval transitive recipr. triplets         -0.5138  ( 0.2533   )    0.0443   
##    7. eval smoke1 alter                         0.1159  ( 0.2352   )    0.0060   
##    8. eval smoke1 ego                          -0.0938  ( 0.2619   )   -0.0082   
##    9. eval smoke1 ego x smoke1 alter            0.3110  ( 0.3684   )    0.0172   
##   10. eval drinking alter                      -0.0876  ( 0.1355   )    0.0251   
##   11. eval drinking squared alter              -0.1345  ( 0.1369   )    0.0207   
##   12. eval drinking ego                         0.0227  ( 0.1233   )   -0.0051   
##   13. eval drinking squared ego                 0.2088  ( 0.1159   )    0.0328   
##   14. eval drinking diff. squared              -0.1079  ( 0.0532   )    0.0410   
## 
## Behavior Dynamics
##   15. rate rate drinking (period 1)             1.3089  ( 0.3279   )   -0.0739   
##   16. rate rate drinking (period 2)             1.7941  ( 0.4726   )   -0.0311   
##   17. eval drinking linear shape                0.4065  ( 0.2180   )   -0.0628   
##   18. eval drinking quadratic shape            -0.5627  ( 0.3290   )    0.0364   
##   19. eval drinking average alter               1.2426  ( 0.8434   )    0.0334   
## 
## Overall maximum convergence ratio:    0.1697 
## 
## 
## Total of 3576 iteration steps.
## 
## Covariance matrix of estimates (correlations below diagonal)
## 
##        0.936       -0.080        0.022       -0.050       -0.022        0.038       -0.016        0.043        0.008        0.013        0.021       -0.008       -0.025        0.005       -0.018       -0.027        0.016       -0.041        0.119
##       -0.100        0.686        0.012       -0.018       -0.028        0.023       -0.015       -0.009        0.010        0.005        0.011        0.009       -0.004        0.003       -0.001       -0.022        0.011        0.020       -0.074
##        0.094        0.060        0.056       -0.049       -0.021        0.028        0.006        0.017        0.002        0.003       -0.001       -0.006       -0.015       -0.004        0.000        0.000       -0.004        0.008       -0.016
##       -0.161       -0.067       -0.645        0.102        0.030       -0.051        0.010       -0.011       -0.002       -0.009       -0.019        0.004        0.016        0.002        0.007        0.009       -0.002        0.000        0.000
##       -0.146       -0.212       -0.550        0.589        0.025       -0.035        0.000        0.001       -0.006       -0.003       -0.003       -0.001        0.002        0.001        0.003        0.001       -0.001       -0.004        0.007
##        0.154        0.112        0.466       -0.632       -0.864        0.064        0.001       -0.002        0.009        0.005        0.000        0.001       -0.001       -0.002       -0.010       -0.002        0.005       -0.001        0.009
##       -0.069       -0.077        0.112        0.138        0.004        0.013        0.055       -0.010       -0.002       -0.016       -0.012        0.003        0.003        0.000        0.005        0.009        0.000        0.000        0.003
##        0.168       -0.042        0.267       -0.135        0.016       -0.033       -0.155        0.069       -0.020        0.002        0.005       -0.015       -0.012       -0.001       -0.004       -0.007       -0.003        0.001       -0.004
##        0.022        0.034        0.018       -0.018       -0.102        0.100       -0.019       -0.207        0.136       -0.001       -0.010        0.002        0.000        0.003       -0.012        0.015        0.005       -0.016        0.040
##        0.101        0.045        0.108       -0.205       -0.148        0.132       -0.493        0.046       -0.018        0.018        0.004       -0.009       -0.002        0.000       -0.003       -0.010       -0.001       -0.001        0.007
##        0.158        0.099       -0.016       -0.426       -0.151        0.005       -0.383        0.134       -0.199        0.217        0.019       -0.001       -0.008        0.001        0.002       -0.009        0.000        0.003       -0.014
##       -0.070        0.088       -0.201        0.111       -0.027        0.019        0.087       -0.473        0.044       -0.558       -0.086        0.015        0.003        0.001        0.005        0.009        0.002       -0.001        0.000
##       -0.227       -0.039       -0.549        0.443        0.124       -0.038        0.101       -0.407        0.010       -0.119       -0.491        0.215        0.013       -0.001       -0.003        0.004        0.001       -0.002        0.008
##        0.098        0.065       -0.284        0.132        0.107       -0.164        0.031       -0.060        0.161       -0.044        0.073        0.095       -0.190        0.003        0.000        0.001        0.001        0.001       -0.001
##       -0.056       -0.003       -0.004        0.064        0.059       -0.117        0.058       -0.042       -0.102       -0.066        0.052        0.130       -0.075       -0.009        0.107        0.014       -0.017        0.005       -0.013
##       -0.059       -0.056        0.003        0.061        0.020       -0.020        0.083       -0.059        0.084       -0.154       -0.134        0.151        0.065        0.026        0.092        0.223       -0.030        0.003        0.002
##        0.075        0.060       -0.077       -0.035       -0.038        0.097        0.005       -0.047        0.057       -0.044        0.008        0.091        0.045        0.054       -0.233       -0.294        0.048       -0.036        0.092
##       -0.128        0.073        0.106       -0.002       -0.083       -0.006        0.001        0.015       -0.129       -0.020        0.056       -0.026       -0.047        0.043        0.046        0.021       -0.506        0.108       -0.263
##        0.146       -0.105       -0.081       -0.002        0.050        0.041        0.014       -0.020        0.129        0.062       -0.121        0.002        0.083       -0.016       -0.046        0.005        0.498       -0.947        0.711
## 
## Derivative matrix of expected statistics X by parameters:
## 
##       11.043        0.000        1.378        0.841        8.235        6.411       -0.786       -0.985       -0.294       -3.048        0.400       -2.673        2.329        1.254       -0.218        0.000       -0.309       -0.661       -1.405
##        0.000       11.147        2.369        2.132       18.813       15.628        0.267        0.067       -0.298        0.343        5.626       -1.022        4.375        3.591        0.000       -0.217       -1.388        0.106       -0.235
##       43.154       26.463      307.573      232.038      647.425      484.535       -0.122        6.303       19.853      -23.508      401.192       -1.291      438.629      507.245       -0.048       -3.875       -0.547       -8.185        2.538
##        8.055        4.287      126.232      134.315      277.665      248.702        2.071        2.612        8.440       -3.127      170.401        2.308      168.681      186.354       -0.314       -1.716       -1.205        1.757        5.744
##       61.221       37.310      402.501      335.965     1455.709     1167.384       22.581       27.287       35.095       37.190      590.542       58.440      609.004      683.061        1.392       -6.073        0.560        1.175        6.531
##       17.017        9.622      189.495      188.328      706.351      640.589        7.561        9.332       14.558        4.649      282.330       16.453      276.293      334.699        1.878       -3.198       -2.603       -2.458        2.040
##        0.429        0.730        2.917        2.615       26.058       18.445       51.039       31.553       10.254       90.203       43.522       74.709       27.623       15.071       -0.678        1.757        5.086        1.464       -0.763
##        0.215       -0.147        8.635        5.710       33.137       23.239       31.733       45.909       11.240       68.536       40.500       79.920       52.990       32.196        0.648       -0.654        0.979        2.935       -0.239
##        5.269        2.159       20.865       17.246       63.473       49.286        9.713       10.822       13.829       15.371       41.663       16.276       43.721        9.839       -0.203       -1.009       -1.164        2.254        1.276
##       -9.940        3.625       14.509       13.179       69.114       42.798       74.864       63.520       15.878      285.722       16.202      232.203       20.666       41.690        2.274        5.441       23.819       -2.587       -4.393
##       44.766       18.927      368.630      309.458      880.572      707.132       35.715       32.230       38.346        8.460      680.578       25.110      629.085      601.033       -4.630       -2.628       -7.289       14.177       22.861
##       -6.501        7.238       46.705       30.093      138.157       83.930       68.038       78.406       20.503      237.429       76.698      314.991       72.613       42.505       -1.656       -2.605        1.682        8.449        1.765
##       70.947       34.647      382.199      257.174      787.913      559.658       24.515       42.249       35.595       -5.446      632.490       13.874      796.275      763.606        1.840       -6.056       -2.309      -19.763       -7.540
##       47.999       26.773      380.639      258.827      792.452      577.637       -6.747       11.075        8.779      -54.585      580.799      -16.526      658.041     1571.445        3.617       -7.002       -9.538      -63.395      -31.106
##        3.262        0.000        1.369        0.204        9.484        7.693        0.325        0.391        1.010       -3.494        9.167       -1.969       -0.103       22.069       14.687        0.000        9.221       -3.746       -2.189
##        0.000        0.265       -1.452        0.050        5.722        6.669       -0.043       -0.286       -0.854        0.979       -0.551       -0.824       -4.667        3.997        0.000       11.274        8.750        0.323        1.059
##       -1.246        2.196      -10.081       -9.953      -22.400      -19.640        1.040        2.437       -0.211       -0.552      -17.347       11.206      -16.691      -25.958        7.321        6.472       54.839       11.283       11.261
##        5.013        4.601        6.097        6.080       52.952       42.383       -3.213       -2.342        1.713      -15.151       10.411       -4.814       16.871      -53.359        1.910       -3.621        1.353      140.885       94.870
##        1.390        2.188        8.491        8.027       32.087       26.484       -2.173       -1.843        0.560      -10.264       14.466       -6.479       14.517       -7.153       -0.517       -2.381       -6.133       50.646       44.849
## 
## Covariance matrix of X (correlations below diagonal):
## 
##      115.996       -9.899       79.752       53.744      271.508      206.030       -0.202        1.460        7.951      -33.269      109.393      -28.144      120.241      143.273       -1.905       -2.090       -7.082        5.909        4.499
##       -0.099       85.946       48.014       29.242      144.802      106.792        3.251        3.436        1.844       19.224       68.152       22.241       70.378       83.111        1.309       -1.853        2.535       -1.694       -3.452
##        0.323        0.226      524.045      413.245     1407.661     1078.773       12.670       16.341       39.466       -0.101      730.984       25.819      764.078      882.547       -2.066       -4.805       -4.169       10.611       33.865
##        0.255        0.161        0.922      383.040     1202.045      983.011       11.892       12.455       31.476        9.591      586.427       18.028      594.380      670.009       -2.080       -2.274       -4.518        7.152       31.792
##        0.337        0.209        0.823        0.822     5583.581     4447.280       65.915       84.618      127.713      123.029     2109.839      185.829     2141.717     2493.800       -4.046       -4.514       -1.140       24.560      119.354
##        0.314        0.189        0.773        0.824        0.976     3716.927       49.669       61.176       97.342       84.427     1635.193      121.416     1619.715     1880.945       -4.689       -0.571       -0.495       17.956      102.543
##       -0.002        0.040        0.064        0.070        0.101        0.094       75.756       59.041       19.238      143.100       77.583      132.755       57.233       52.285        2.265        1.753        8.471       -1.686       -2.279
##        0.016        0.044        0.085        0.076        0.135        0.119        0.806       70.841       20.163      125.094       73.843      141.214       77.325       58.147        1.497       -0.440        6.348        1.326        0.213
##        0.166        0.045        0.387        0.361        0.383        0.358        0.496        0.537       19.896       31.308       75.422       34.424       78.596       33.799       -0.377       -1.565       -1.087        1.093        4.016
##       -0.135        0.091        0.000        0.021        0.072        0.061        0.718        0.649        0.307      523.687       17.678      450.789        3.005       36.024        5.045        5.393       24.569       -3.796      -14.738
##        0.268        0.194        0.841        0.790        0.744        0.707        0.235        0.231        0.446        0.020     1440.359       71.746     1337.457     1487.742        1.031       -4.798      -11.691       -1.172       46.373
##       -0.110        0.101        0.047        0.039        0.105        0.084        0.641        0.706        0.325        0.828        0.080      565.428       46.365       89.319        9.958        2.527       38.205        2.973       -3.748
##        0.280        0.191        0.838        0.762        0.719        0.667        0.165        0.231        0.442        0.003        0.884        0.049     1587.958     1549.701      -12.394      -15.379      -15.374       28.905       59.752
##        0.221        0.149        0.640        0.568        0.554        0.512        0.100        0.115        0.126        0.026        0.650        0.062        0.645     3632.938       24.237        8.268        3.413      -80.817       -3.724
##       -0.038        0.031       -0.020       -0.023       -0.012       -0.017        0.056        0.039       -0.018        0.048        0.006        0.091       -0.067        0.087       21.292        0.049       14.777       -0.913       -3.150
##       -0.039       -0.041       -0.043       -0.024       -0.012       -0.002        0.041       -0.011       -0.071        0.048       -0.026        0.022       -0.078        0.028        0.002       24.210       13.244       -1.791       -0.952
##       -0.070        0.029       -0.020       -0.025       -0.002       -0.001        0.104        0.081       -0.026        0.115       -0.033        0.172       -0.041        0.006        0.343        0.288       87.126        9.976       18.234
##        0.039       -0.013        0.033        0.026        0.023        0.021       -0.014        0.011        0.017       -0.012       -0.002        0.009        0.051       -0.095       -0.014       -0.026        0.076      198.574      132.932
##        0.032       -0.028        0.112        0.123        0.121        0.128       -0.020        0.002        0.068       -0.049        0.093       -0.012        0.114       -0.005       -0.052       -0.015        0.148        0.716      173.638
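As a rough gauge of the influence effect, we can form a Wald-type t-ratio from the estimate and standard error of ‘drinking average alter’ in the table above:

```r
# Rough Wald-type test for the 'drinking average alter' (influence) effect,
# using the estimate and standard error from the results table above
t.avAlt <- 1.2426 / 0.8434
round(t.avAlt, 2)  # about 1.47, i.e. not significant at the conventional 5% level
```

A more principled assessment would use the score-type or Wald tests provided by RSiena, but this back-of-the-envelope ratio suggests the evidence for influence on drinking is weak in this model.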